1.  INTRODUCTION


Stein (1945) describes a two-stage procedure to obtain a fixed-width
confidence interval for the mean of a normal population when the
variance is unknown. This is followed by the works of Anscombe
(1953) and Chow and Robbins (1965), who advocate sequential
procedures. Hall (1981) suggests a three-stage sampling technique that
combines the simplicity of Stein's procedure with the efficiency of
the fully sequential method. For a linear model
$Y_i = X_i\beta + \epsilon_i$, where $\epsilon_i$ is $N(0,\sigma^2)$,
the corresponding problem of obtaining a fixed-width confidence
interval for one of the parameters is more difficult, since the
variance of the usual estimate depends not only on $\sigma^2$ but
also on the $X_i$. To avoid this difficulty, Stein (1945) assumes
that $X_1, \ldots, X_m$ are fixed and that they are repeated as a whole,
as many times as is necessary. For example, $X_1, \ldots, X_m$ may
correspond to an orthogonal design which we are replicating. Bishop
(1978) continues to assume that the $X_i$ are fixed.

In this paper, we consider simple linear regression
$Y_i = \gamma + \beta X_i + \epsilon_i$, where $\epsilon_i$ is
$N(0,\sigma^2)$ and $X_i$ is $N(\mu,\tau^2)$. In other words, we are
sampling from a bivariate normal population. In section 2, we describe
a two-stage procedure to obtain a fixed-width confidence interval for
$\beta$ and prove that the specified coverage probability is attained.
Essentially, we estimate $\sigma^2$ and predict $X_n$, $n > m$, on the
basis of a pilot sample $(X_1,Y_1), \ldots, (X_m,Y_m)$ to determine the
size of the second sample. If we sample sequentially, then there is no
need to predict $X_n$, $n > m$; such a procedure is described in
section 3. We show that the corresponding confidence interval attains
the specified coverage probability regardless of the distribution of
the $X_i$. The procedure behaves like Stein's procedure for the
estimation of a normal mean. By updating the estimate of $\sigma^2$
sequentially, we arrive at another procedure. Section 4 deals with the
related problem of deriving a test procedure of $H: \beta = \beta_0$ at
level $\alpha_0$ which has power at least $\alpha_1$ at
$\beta = \beta_0 + \Delta$, independent of the values of the other
parameters. One way to construct such a test makes use of fixed-width
confidence intervals for $\beta$. A different approach, which treats
$X_i$ and $Y_i$ symmetrically, is based on the distribution of the
sample correlation coefficient. We show that the resulting test
attains the specified level and power asymptotically.

2.  A  TWO-STAGE  PROCEDURE 

Suppose that $\sigma^2$ is known and the $X_i$ are known constants. Then
$\hat\beta_n$ is $N(\beta, \sigma^2/\sum_{i=1}^{n}(X_i-\bar X_n)^2)$, where
$\hat\beta_n = \sum_{i=1}^{n}(X_i-\bar X_n)Y_i / \sum_{i=1}^{n}(X_i-\bar X_n)^2$
is the least squares estimate of $\beta$ based on
$(X_1,Y_1), \ldots, (X_n,Y_n)$. It follows that
$P(|\hat\beta_n - \beta| < d) \ge 1 - \alpha$ if

    $\sum_{i=1}^{n}(X_i-\bar X_n)^2 \ge z_{1-\alpha/2}^2\sigma^2/d^2 = S_0$

where $z_{1-\alpha/2}$ stands for the $(1-\alpha/2)$-percentile of the
standard normal distribution. Since $\sigma^2$ is unknown and the $X_i$
are stochastic, we need to estimate $\sigma^2$ and predict $X_n$, $n > m$,
on the basis of the pilot sample $(X_1,Y_1), \ldots, (X_m,Y_m)$,
$m \ge 3$. An obvious estimate of $\sigma^2$ is
$\hat\sigma_m^2 = \sum_{i=1}^{m}(Y_i - \hat\gamma_m - \hat\beta_m X_i)^2/(m-2)$.
To reduce the prediction problem, we note that we only need to predict
$\sum_{i=1}^{n}(X_i-\bar X_n)^2$ for $n > m$.
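As a rough numerical illustration of the known-variance condition above, the following Python sketch computes $S_0 = z_{1-\alpha/2}^2\sigma^2/d^2$ and the coverage $2\Phi(d\sqrt{S}/\sigma) - 1$ obtained when the centered sum of squares equals $S$. The function names and the numerical values are hypothetical and are not taken from the paper.

```python
from statistics import NormalDist


def required_sum_of_squares(alpha, sigma, d):
    """S_0 = z_{1-alpha/2}^2 * sigma^2 / d^2, with sigma assumed known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return (z * sigma / d) ** 2


def coverage(S, sigma, d):
    """P(|beta_hat - beta| < d) when beta_hat is N(beta, sigma^2 / S)."""
    return 2 * NormalDist().cdf(d * S ** 0.5 / sigma) - 1


# Hypothetical inputs: alpha = 0.05, sigma = 2, half-width d = 0.5.
s0 = required_sum_of_squares(alpha=0.05, sigma=2.0, d=0.5)
print(s0, coverage(s0, sigma=2.0, d=0.5))
```

Any design whose centered sum of squares exceeds `s0` gives coverage above the nominal level, which is the target the later procedures aim at when $\sigma^2$ has to be estimated.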

Since $X_i$ is $N(\mu,\tau^2)$, we make the Helmert transformation to obtain
$\sum_{i=1}^{m}(X_i-\bar X_m)^2 = \tau^2(U_2^2 + \ldots + U_m^2)$ and
$\sum_{i=1}^{n}(X_i-\bar X_n)^2 = \tau^2(U_2^2 + \ldots + U_m^2 + \ldots + U_n^2)$
for $n > m$, where $U_2, U_3, \ldots$ are independent standard normal
variables. This allows us to make use of standard results of
prediction for the gamma case. In particular, if
$b_n = 1 + \chi^2_{1-c}(n-m)/\chi^2_{g}(m)$, where $\chi^2_{1-c}(n-m)$ and
$\chi^2_{g}(m)$ are chi-square percentiles, then for each $n > m$,
$(b_n\sum_{i=1}^{m}(X_i-\bar X_m)^2, \infty)$ is a $(c,g)$ guaranteed
coverage interval predictor of $\sum_{i=1}^{n}(X_i-\bar X_n)^2$
(Aitchison & Dunsmore 1975, Ch.6). Furthermore, we can guarantee
coverage simultaneously, so that with probability $g$ the pilot sample
$X_1, \ldots, X_m$ is such that
$P(\sum_{i=1}^{n}(X_i-\bar X_n)^2 \ge b_n\sum_{i=1}^{m}(X_i-\bar X_m)^2 \mid X_1, \ldots, X_m) \ge c$
for each $n > m$. We choose $c$, $g$ so that $cg > 1 - \alpha$ and define
$\alpha'$ by $1 - \alpha = cg(1-\alpha')$. For convenience, we let
$b_m = 1$. Consider the following two-stage sampling procedure.
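The Helmert decomposition used above can be checked numerically. The sketch below is an illustration only, with hypothetical function names: it builds the Helmert components of a sample and confirms that their squares add up to the centered sum of squares, and that the first $m-1$ components depend only on the pilot sample. The components computed here are unscaled, so they correspond to $\tau U_k$ in the notation of the text.

```python
import math
import random


def helmert_squares(x):
    """Squared Helmert components for k = 2..n (these equal tau^2 * U_k^2)."""
    squares = []
    running_sum = x[0]
    for k in range(2, len(x) + 1):
        xk = x[k - 1]
        # k-th Helmert contrast: (X_1 + ... + X_{k-1} - (k-1) X_k) / sqrt(k (k-1))
        u = (running_sum - (k - 1) * xk) / math.sqrt(k * (k - 1))
        squares.append(u * u)
        running_sum += xk
    return squares


def centered_ss(x):
    """Sum of squared deviations from the sample mean."""
    mean = sum(x) / len(x)
    return sum((xi - mean) ** 2 for xi in x)


random.seed(0)
x = [random.gauss(1.0, 2.0) for _ in range(12)]  # hypothetical mu = 1, tau = 2
m = 5                                            # hypothetical pilot size
sq = helmert_squares(x)
print(sum(sq[: m - 1]), centered_ss(x[:m]))      # pilot part
print(sum(sq), centered_ss(x))                   # full sample
```

Because the extra terms for $k > m$ are independent of the pilot sample, predicting the full sum of squares reduces to predicting a chi-square variable from another independent chi-square variable.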

Procedure 1. (i) Obtain a pilot sample $(X_1,Y_1), \ldots, (X_m,Y_m)$ and
calculate $\hat\gamma_m$, $\hat\beta_m$ and $\hat\sigma_m^2$.

(ii) Draw a second sample of size $N_1 - m$, where $N_1$ is the smallest
$n \ge m$ such that
$b_n\sum_{i=1}^{m}(X_i-\bar X_m)^2 \ge t_{1-\alpha'/2}^2[m-2]\hat\sigma_m^2/d^2$
and $t_{1-\alpha'/2}[m-2]$ is the $(1-\alpha'/2)$-percentile of a
t distribution with $m-2$ degrees of freedom.

The following theorem says that
$(\hat\beta_{N_1} - d, \hat\beta_{N_1} + d)$ is a $(1-\alpha)$-level
confidence interval for $\beta$.

Theorem 1. $P(|\hat\beta_{N_1} - \beta| < d) \ge 1 - \alpha$.

Before  we  prove  theorem  1,  we  first  state  two  lemmas. 

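Lemma 1. The conditional distribution of $\hat\beta_{N_1}$ given
$\hat\sigma_m$ and $X_1, X_2, \ldots$ is $N(\beta, \sigma^2/S_1)$, where
$S_1 = \sum_{i=1}^{N_1}(X_i-\bar X_{N_1})^2$.

Proof. Given $X_1, X_2, \ldots$, $N_1$ depends only on $\hat\sigma_m^2$, and
$\hat\beta_{N_1}$ can be written as a linear combination of
$\hat\gamma_m$, $\hat\beta_m$ and $Y_{m+1}, \ldots, Y_{N_1}$, all of
which are independent of $\hat\sigma_m^2$.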

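Lemma 2.
$P(\sum_{i=1}^{N_1}(X_i-\bar X_{N_1})^2 \ge b_{N_1}\sum_{i=1}^{m}(X_i-\bar X_m)^2 \mid \hat\sigma_m) \ge gc$.

Proof. Let
$A = \{(x_1,\ldots,x_m) : \forall n > m,\ P(\sum_{i=1}^{n}(X_i-\bar X_n)^2 \ge b_n\sum_{i=1}^{m}(x_i-\bar x_m)^2 \mid X_1 = x_1, \ldots, X_m = x_m) \ge c\}$,
then $P((X_1,\ldots,X_m) \in A) = g$ by our choice of $b_n$. Since
$\hat\sigma_m$ is independent of the $X_i$, we also have
$P((X_1,\ldots,X_m) \in A \mid \hat\sigma_m) = g$. If
$(X_1,\ldots,X_m) = (x_1,\ldots,x_m) \in A$ and we write
$n_1 = N_1(\hat\sigma_m, x_1, \ldots, x_m)$, then

    $P(\sum_{i=1}^{N_1}(X_i-\bar X_{N_1})^2 \ge b_{N_1}\sum_{i=1}^{m}(X_i-\bar X_m)^2 \mid \hat\sigma_m, X_1 = x_1, \ldots, X_m = x_m)$
    $= P(\sum_{i=1}^{n_1}(X_i-\bar X_{n_1})^2 \ge b_{n_1}\sum_{i=1}^{m}(X_i-\bar X_m)^2 \mid X_1 = x_1, \ldots, X_m = x_m)$
    $\ge c$.

Combining, we have the desired result.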

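Corollary 1.
$P(\sum_{i=1}^{N_1}(X_i-\bar X_{N_1})^2 \ge t_{1-\alpha'/2}^2[m-2]\hat\sigma_m^2/d^2 \mid \hat\sigma_m) \ge gc$.

Proof. This follows from lemma 2 and the definition of $N_1$.
We now prove theorem 1.

    $P(|\hat\beta_{N_1} - \beta| < d \mid \hat\sigma_m)$
    $= E\{2\Phi(d S_1^{1/2}/\sigma) - 1 \mid \hat\sigma_m\}$    by lemma 1
    $\ge gc\{2\Phi(t_{1-\alpha'/2}[m-2]\hat\sigma_m/\sigma) - 1\}$    by corollary 1.

Thus
    $P(|\hat\beta_{N_1} - \beta| < d) \ge gc\,E\{2\Phi(t_{1-\alpha'/2}[m-2]\hat\sigma_m/\sigma) - 1\}$
    $= gc(1 - \alpha')$
    $= 1 - \alpha$.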


3.  SEQUENTIAL  PROCEDURES 

If  we  sample  sequentially,  then  prediction  is  no  longer 
necessary. 

Procedure 2. (i) Obtain a pilot sample of size $m$. (ii) Sample
sequentially until
$\sum_{i=1}^{n}(X_i-\bar X_n)^2 \ge t_{1-\alpha/2}^2[m-2]\hat\sigma_m^2/d^2$.

Let $N_2$ be the sample size when we terminate sampling. Our next
theorem asserts that $(\hat\beta_{N_2} - d, \hat\beta_{N_2} + d)$ is a
$(1-\alpha)$-level confidence interval for $\beta$.

Theorem 2. $P(|\hat\beta_{N_2} - \beta| < d) \ge 1 - \alpha$.

We  first  state  a  lemma. 

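Lemma 3. The conditional distribution of $\hat\beta_{N_2}$ given
$\hat\sigma_m$ and $X_1, X_2, \ldots$ is $N(\beta, \sigma^2/S_2)$, where
$S_2 = \sum_{i=1}^{N_2}(X_i-\bar X_{N_2})^2$.

This is the analog of lemma 1 and can be proved using a
similar technique. We now prove theorem 2.

    $P(|\hat\beta_{N_2} - \beta| < d) = E\{P(|\hat\beta_{N_2} - \beta| < d \mid \hat\sigma_m, X_1, X_2, \ldots)\}$
    $= E\{2\Phi(d S_2^{1/2}/\sigma) - 1\}$    by lemma 3
    $\ge E\{2\Phi(t_{1-\alpha/2}[m-2]\hat\sigma_m/\sigma) - 1\}$
    $= 1 - \alpha$.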

We note that theorem 2 holds even when the $X_i$ are not normally
distributed.
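A small simulation can make procedure 2 concrete. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the true parameters, the uniform distribution of the $X_i$, the function names, and the value 2.306 (taken from a t table as $t_{0.975}[8]$ for $m = 10$ and $\alpha = 0.05$) are all choices made here for the example.

```python
import random


def slope_and_resid_var(xs, ys):
    """Least squares slope, residual variance (n - 2 divisor), centered SS of xs."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    intercept = ybar - slope * xbar
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return slope, rss / (n - 2), sxx


def procedure2(draw_pair, m, t_crit, d):
    """Sample until sum (X_i - Xbar_n)^2 >= t_crit^2 * sigma_hat_m^2 / d^2."""
    pilot = [draw_pair() for _ in range(m)]
    xs, ys = [p[0] for p in pilot], [p[1] for p in pilot]
    _, sigma2_hat, sxx = slope_and_resid_var(xs, ys)  # variance estimate frozen here
    threshold = t_crit ** 2 * sigma2_hat / d ** 2
    mean = sum(xs) / m
    while sxx < threshold:
        x, y = draw_pair()
        xs.append(x)
        ys.append(y)
        new_mean = mean + (x - mean) / len(xs)
        sxx += (x - mean) * (x - new_mean)            # running update of centered SS
        mean = new_mean
    slope, _, sxx_final = slope_and_resid_var(xs, ys)
    return slope, len(xs), sxx_final, threshold


random.seed(1)
GAMMA, BETA, SIGMA = 1.0, 2.0, 1.0   # hypothetical true parameters
D = 0.2                              # hypothetical half-width


def draw_pair():
    x = random.uniform(-3.0, 3.0)    # deliberately non-normal X
    return x, GAMMA + BETA * x + random.gauss(0.0, SIGMA)


T_975_8 = 2.306                      # t_{0.975}[8] from a t table (m = 10)
runs = [procedure2(draw_pair, m=10, t_crit=T_975_8, d=D) for _ in range(2000)]
coverage = sum(abs(b - BETA) < D for b, _, _, _ in runs) / len(runs)
print(coverage, sum(n for _, n, _, _ in runs) / len(runs))
```

The estimated coverage should sit at or above the nominal 0.95 up to simulation error, in line with theorem 2 and the remark that normality of the $X_i$ is not needed.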

Since the estimate of $\sigma^2$ is not updated as we sample
sequentially, procedure 2 is inefficient. It behaves like Stein's
procedure for the estimation of the mean of a normal population.
In fact

    $E(S_2) = E(\sum_{i=1}^{N_2}(X_i-\bar X_{N_2})^2)$
    $\approx E(t_{1-\alpha/2}^2[m-2]\hat\sigma_m^2/d^2)$
    $= S_0\,t_{1-\alpha/2}^2[m-2]/z_{1-\alpha/2}^2$

so that $E(S_2)/S_0 \approx t_{1-\alpha/2}^2[m-2]/z_{1-\alpha/2}^2 > 1$.
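To give a feel for the size of this inefficiency, the following sketch evaluates the ratio $t_{1-\alpha/2}^2[m-2]/z_{1-\alpha/2}^2$ for a few pilot sizes at $\alpha = 0.05$. The t percentiles are entered by hand from a standard t table, and the choice of pilot sizes is arbitrary.

```python
from statistics import NormalDist

# t_{0.975}[m - 2] from a standard t table; keys are pilot sample sizes m.
T_975 = {5: 3.182, 10: 2.306, 20: 2.101, 30: 2.048}

z = NormalDist().inv_cdf(0.975)
ratios = {m: (t / z) ** 2 for m, t in T_975.items()}
for m, ratio in ratios.items():
    print(m, round(ratio, 3))
```

The ratio exceeds 1 for every pilot size and shrinks toward 1 as $m$ grows, so a larger pilot sample reduces the price paid for not updating the variance estimate.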

If the estimate of $\sigma^2$ is updated sequentially, we obtain the
following procedure.

Procedure 3. (i) Obtain a pilot sample of size $m$. (ii) Sample
sequentially until
$\sum_{i=1}^{n}(X_i-\bar X_n)^2 \ge a_n^2\hat\sigma_n^2/d^2$, where
$\{a_n\}$ is a sequence of constants converging to $z_{1-\alpha/2}$.

We expect procedure 3 to be the most efficient, but unlike
procedures 1 and 2, the specified coverage probability is attained
only asymptotically. Procedure 1 is the least efficient, since we have
to deal with the additional problem of prediction; however, it has
the advantage of requiring only two sampling operations.

4.  A  RELATED  PROBLEM 


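A problem related to fixed-width interval estimation of $\beta$ is
that of deriving a test procedure of $H: \beta = \beta_0$ at level
$\alpha_0$ which has power at least $\alpha_1$ at
$\beta = \beta_0 + \Delta$, $\Delta > 0$. We can make use of our
earlier results to solve this problem. For instance, we can use
procedure 2 to obtain a $(1-\alpha_0)$-level confidence interval for
$\beta$ with width $2d$, $d \le \Delta$, and reject $H$ if $\beta_0$ lies
outside that interval. The resulting test has level $\alpha_0$ and its
power at $\beta = \beta_0 + \Delta$ is

    $P_{\beta_0+\Delta}(|\hat\beta_{N_2} - \beta_0| > d) \ge P_{\beta_0+\Delta}(\hat\beta_{N_2} - \beta_0 > d)$
    $= E\{P_{\beta_0+\Delta}(\hat\beta_{N_2} - \beta_0 > d \mid \hat\sigma_m, X_1, X_2, \ldots)\}$
    $= E\{1 - \Phi((d-\Delta)S_2^{1/2}/\sigma)\}$
    $\ge E\{1 - \Phi((d-\Delta)\,t_{1-\alpha_0/2}[m-2]\hat\sigma_m/(d\sigma))\}$.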

If we choose $d$ such that
$(\Delta - d)\,t_{1-\alpha_0/2}[m-2]/d = t_{\alpha_1}[m-2]$, then the
power is at least $\alpha_1$. As expected, if $d = \Delta$, then the
power is at least $1/2$; as $d \to 0$, the power increases to 1.
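The condition on $d$ can be solved in closed form. The sketch below assumes that the right-hand side of the condition is the $\alpha_1$-percentile of the t distribution with $m-2$ degrees of freedom, written here as `t_power`, and that `t_level` is $t_{1-\alpha_0/2}[m-2]$; both names are hypothetical, and the numerical percentiles are taken from a t table for $m = 10$.

```python
def half_width_for_power(delta, t_level, t_power):
    """Solve (delta - d) * t_level / d = t_power for d.

    Rearranging gives d = delta * t_level / (t_level + t_power).
    """
    return delta * t_level / (t_level + t_power)


# Hypothetical example: m = 10, alpha_0 = 0.05, alpha_1 = 0.90, Delta = 1.
# t_{0.975}[8] = 2.306 and t_{0.90}[8] = 1.397 are t-table values.
delta = 1.0
d = half_width_for_power(delta, t_level=2.306, t_power=1.397)
print(d)

# A power target of 1/2 corresponds to a zero percentile and gives d = Delta.
print(half_width_for_power(delta, t_level=2.306, t_power=0.0))
```

Higher power targets make `t_power` larger and therefore push $d$ further below $\Delta$, which in turn raises the stopping threshold of procedure 2.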

The technique we employ so far is to condition on the $X_i$ and
then treat them as if they are fixed. An unconditional approach
treating the $X_i$ and $Y_i$ symmetrically is described below. Without
loss of generality, the hypothesis is $H: \beta = 0$. Assume that we
are sampling from a bivariate normal population $(X, Y)'$;
then $H$ is equivalent to $\rho = 0$ and the usual t test rejects $H$ if
$|r|$ is too large, where $r$ is the sample correlation coefficient.
Since the distribution of $r$ depends on the parameters only through
$\rho$, we can determine the sample size such that the level-$\alpha_0$
test of $\rho = 0$ has power $\alpha_1$ at another $\rho$ value. Bock
(1977) makes use of the Fisher Z-transformation to derive an
approximate formula for the required sample size

    $z_{1-\alpha_0/2} - (n-3)^{1/2}\tanh^{-1}\rho = z_{1-\alpha_1}$.

Since $\rho = \theta/(1+\theta^2)^{1/2}$ where
$\theta = \beta\tau/\sigma$, the following procedure suggests itself.
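The approximate sample size formula is easy to evaluate. The sketch below solves $z_{1-\alpha_0/2} - (n-3)^{1/2}\tanh^{-1}\rho = z_{1-\alpha_1}$ for the smallest integer $n$ and converts $\theta$ to $\rho$; the function names and example values are hypothetical, and the code assumes $\rho > 0$ and $\alpha_1 > \alpha_0/2$.

```python
import math
from statistics import NormalDist


def rho_from_theta(theta):
    """rho = theta / sqrt(1 + theta^2), with theta = beta * tau / sigma."""
    return theta / math.sqrt(1 + theta ** 2)


def approx_sample_size(rho, alpha0, alpha1):
    """Smallest n with z_{1-alpha0/2} - sqrt(n - 3) * atanh(rho) <= z_{1-alpha1}."""
    nd = NormalDist()
    gap = nd.inv_cdf(1 - alpha0 / 2) - nd.inv_cdf(1 - alpha1)
    return 3 + math.ceil((gap / math.atanh(rho)) ** 2)


# Hypothetical example: level 0.05, power 0.90, alternative rho = 0.3.
n_needed = approx_sample_size(rho=0.3, alpha0=0.05, alpha1=0.90)
print(n_needed)
print(rho_from_theta(1.0))
```

Since $\rho$ depends on the unknown ratio $\tau/\sigma$, this fixed sample size cannot be computed in advance, which is what motivates replacing $\rho$ by a running estimate.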

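Procedure 4. (i) Obtain a pilot sample of size $m$. (ii) Sample
sequentially until
$z_{1-\alpha_0/2} - (n-3)^{1/2}\tanh^{-1}\hat\rho_n(\Delta) \le z_{1-\alpha_1}$,
where
$\hat\rho_n(\Delta) = \hat\theta_n(\Delta)/(1+\hat\theta_n^2(\Delta))^{1/2}$,
$\hat\theta_n(\Delta) = \Delta\hat\tau_n/\hat\sigma_n$ and
$\hat\tau_n^2 = \sum_{i=1}^{n}(X_i-\bar X_n)^2/(n-1)$. (iii) Perform a
two-sided t test treating the final sample size $N(\Delta)$ as if it is
fixed. Thus if
$T_n = \hat\beta_n(\sum_{i=1}^{n}(X_i-\bar X_n)^2)^{1/2}/\hat\sigma_n$, we
reject $H$ if $|T_{N(\Delta)}| > t_{1-\alpha_0/2}[N(\Delta)-2]$.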
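The stopping rule in step (ii) can be sketched as follows. This is a minimal illustration that assumes $\hat\sigma_n^2$ is the residual mean square from the regression on the first $n$ pairs; the function name, the data generator and the parameter values are hypothetical, and step (iii) is left out because it needs t percentiles for the random final sample size.

```python
import math
import random
from statistics import NormalDist


def procedure4_stop(draw_pair, m, delta, alpha0, alpha1, max_n=100000):
    """Return (N, rho_hat_N, T_N) at the first n >= m meeting the stopping rule."""
    nd = NormalDist()
    z0, z1 = nd.inv_cdf(1 - alpha0 / 2), nd.inv_cdf(1 - alpha1)
    n = 0
    sx = sy = sxx = syy = sxy = 0.0            # raw running sums
    while n < max_n:
        x, y = draw_pair()
        n += 1
        sx += x; sy += y
        sxx += x * x; syy += y * y; sxy += x * y
        if n < m:
            continue
        cxx = sxx - sx * sx / n                # centered sums of squares/products
        cyy = syy - sy * sy / n
        cxy = sxy - sx * sy / n
        tau2_hat = cxx / (n - 1)
        sigma2_hat = (cyy - cxy * cxy / cxx) / (n - 2)   # residual mean square
        theta_hat = delta * math.sqrt(tau2_hat / sigma2_hat)
        rho_hat = theta_hat / math.sqrt(1 + theta_hat ** 2)
        if z0 - math.sqrt(n - 3) * math.atanh(rho_hat) <= z1:
            t_stat = cxy / math.sqrt(cxx * sigma2_hat)   # T_n = beta_hat * sqrt(cxx) / sigma_hat
            return n, rho_hat, t_stat
    raise RuntimeError("stopping rule not met within max_n observations")


random.seed(2)


def draw_pair():
    x = random.gauss(0.0, 1.0)                 # hypothetical mu = 0, tau = 1
    return x, 1.0 + 0.5 * x + random.gauss(0.0, 1.0)   # hypothetical beta = Delta = 0.5


n_stop, rho_hat, t_stat = procedure4_stop(draw_pair, m=10, delta=0.5,
                                          alpha0=0.05, alpha1=0.90)
print(n_stop, rho_hat, t_stat)
```

With these values $\theta = 0.5$, so the stopping time should fluctuate around the fixed sample size given by the approximate formula for $\rho = 0.5/(1.25)^{1/2}$.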

The  following  theorem  asserts  that  the  test  procedure  attains 
the  specified  level  and  power  asymptotically. 

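Theorem 3.
$\lim_{\Delta\to 0} P_{\beta=0}(|T_{N(\Delta)}| < t_{1-\alpha_0/2}[N(\Delta)-2]) = 1 - \alpha_0$,

$\lim_{\Delta\to 0} P_{\beta=\Delta}(|T_{N(\Delta)}| > t_{1-\alpha_0/2}[N(\Delta)-2]) \ge \alpha_1$.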

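Proof. (i) Since $r_n = T_n/(n-2+T_n^2)^{1/2}$, where $r_n$ is the sample
correlation coefficient computed from $(X_1,Y_1), \ldots, (X_n,Y_n)$,

    $1 - \alpha_0 = P_{\beta=0}(|T_n| < t_{1-\alpha_0/2}[n-2])$
    $= P_{\beta=0}(|(n-3)^{1/2}\tanh^{-1}r_n| < C_n)$

where
$C_n = (n-3)^{1/2}\tanh^{-1}(t_{1-\alpha_0/2}[n-2]/(n-2+t_{1-\alpha_0/2}^2[n-2])^{1/2})$.
On the other hand, when $\beta = 0$

    $(n-3)^{1/2}\tanh^{-1}r_n \to_D N(0,1)$ as $n \to \infty$,

so we must have $\lim_{n\to\infty} C_n = z_{1-\alpha_0/2}$. Since
$N(\Delta) \to \infty$ a.s. as $\Delta \to 0$,
$\lim_{\Delta\to 0} C_{N(\Delta)} = z_{1-\alpha_0/2}$ a.s., and it follows
from a theorem of Anscombe (1952) that when $\beta = 0$

    $(N(\Delta)-3)^{1/2}\tanh^{-1}r_{N(\Delta)} \to_D N(0,1)$.

Thus
    $\lim_{\Delta\to 0} P_{\beta=0}(|T_{N(\Delta)}| < t_{1-\alpha_0/2}[N(\Delta)-2])$
    $= \lim_{\Delta\to 0} P_{\beta=0}(|(N(\Delta)-3)^{1/2}\tanh^{-1}r_{N(\Delta)}| < C_{N(\Delta)})$
    $= 1 - \alpha_0$.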

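(ii) Assume for the time being that under $\beta = \Delta$

    $(N(\Delta)-3)^{1/2}\tanh^{-1}r_{N(\Delta)} \to_D N(z_{1-\alpha_0/2} - z_{1-\alpha_1}, 1)$ as $\Delta \to 0$,    (1)

then
    $\lim_{\Delta\to 0} P_{\beta=\Delta}(|T_{N(\Delta)}| > t_{1-\alpha_0/2}[N(\Delta)-2])$
    $\ge \lim_{\Delta\to 0} P_{\beta=\Delta}((N(\Delta)-3)^{1/2}\tanh^{-1}r_{N(\Delta)} > C_{N(\Delta)})$
    $= \alpha_1$.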

To prove (1), we fix $\gamma$, $\sigma$, $\mu$, $\tau$ and define
$n(\Delta) - 3$ to be the least integer greater than or equal to
$(z_{1-\alpha_0/2} - z_{1-\alpha_1})^2/(\tanh^{-1}\rho(\Delta))^2$, where
$\rho(\Delta) = \theta(\Delta)/(1+\theta^2(\Delta))^{1/2}$ and
$\theta(\Delta) = \Delta\tau/\sigma$. Under $\beta = \Delta$

    $(n(\Delta)-3)^{1/2}(\tanh^{-1}r_{n(\Delta)} - \tanh^{-1}\rho(\Delta)) \to_D N(0,1)$ as $\Delta \to 0$,

equivalently

    $(n(\Delta)-3)^{1/2}\tanh^{-1}r_{n(\Delta)} \to_D N(z_{1-\alpha_0/2} - z_{1-\alpha_1}, 1)$ as $\Delta \to 0$,    (2)

from which (1) follows if we can replace $n(\Delta)$ by $N(\Delta)$. To
that end, we note that if $X_i$ is $N(\mu,\tau^2)$ and $Y_i$ is
$N(\gamma,\sigma^2)$ independently of $X_i$, then the conditional
distribution of $Y_i + \beta X_i$ given $X_i$ is
$N(\gamma + \beta X_i, \sigma^2)$. The advantage of this representation
is that it enables us to deal with a single array of random variables
rather than a double array. In particular, we can show
$N(\Delta)/n(\Delta) \to 1$ a.s. as $\Delta \to 0$. A generalization of
Anscombe's theorem enables us to replace $n(\Delta)$ by $N(\Delta)$ in
(2); we omit the details.